Measuring associational thinking through word embeddings
Abstract
The development of a model to quantify semantic similarity and relatedness between words has been the major focus of many studies in various fields, e.g., psychology, linguistics, and natural language processing. Unlike the measures proposed by most previous research, this article aims at automatically estimating the strength of associations between words, whether or not the words are semantically related. We demonstrate that performance depends not only on the combination of independently constructed word embeddings (namely, corpus-based and network-based embeddings) but also on the way these vectors interact. The research concludes that a weighted average of cosine-similarity coefficients derived from two independent vector spaces tends to yield high correlations with human judgements. Moreover, evaluating associations through a measure that relies on the rank ordering of pairs reveals some findings that go unnoticed by traditional measures such as Spearman's and Pearson's correlation coefficients.
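The core idea of the abstract (combining cosine similarities from two independently built embedding spaces via a weighted average) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `association_strength`, the weight `alpha`, and the toy embeddings are all assumptions for demonstration purposes.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association_strength(word_a, word_b, corpus_emb, network_emb, alpha=0.5):
    """Weighted average of cosine similarities computed in two independent
    vector spaces (e.g. corpus-based and network-based embeddings).
    `alpha` weights the corpus-based space; (1 - alpha) the network-based one.
    Illustrative sketch only -- not the paper's exact formulation."""
    sim_corpus = cosine(corpus_emb[word_a], corpus_emb[word_b])
    sim_network = cosine(network_emb[word_a], network_emb[word_b])
    return alpha * sim_corpus + (1 - alpha) * sim_network

# Toy two-dimensional embeddings (illustrative only)
corpus_emb = {"cat": np.array([1.0, 0.2]), "dog": np.array([0.9, 0.3])}
network_emb = {"cat": np.array([0.5, 0.8]), "dog": np.array([0.4, 0.9])}
score = association_strength("cat", "dog", corpus_emb, network_emb, alpha=0.6)
```

In practice the two spaces would come from, e.g., a corpus-trained model and an embedding of a word-association network, and `alpha` would be tuned against human association norms.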
Similar resources
Word Embeddings through Hellinger PCA
Word embeddings resulting from neural language models have been shown to be successful for a large variety of NLP tasks. However, such architectures can be difficult to train and time-consuming. Instead, we propose to drastically simplify the word embeddings computation through a Hellinger PCA of the word co-occurrence matrix. We compare those new word embeddings with the Collobert and Weston (...
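The Hellinger PCA idea from the snippet above can be sketched in a few lines: normalize each co-occurrence row into a probability distribution, apply the square-root (Hellinger) map, then project onto the top principal components. This is a minimal sketch under those assumptions, not the paper's implementation; the function name and toy counts are illustrative.

```python
import numpy as np

def hellinger_pca_embeddings(cooc, dim=2):
    """Word embeddings via PCA on Hellinger-transformed co-occurrence rows.
    Each row of `cooc` is normalized to a probability distribution,
    square-rooted (the Hellinger map), centered, and projected onto
    the top `dim` principal components."""
    probs = cooc / cooc.sum(axis=1, keepdims=True)   # row distributions
    H = np.sqrt(probs)                               # Hellinger transform
    H_centered = H - H.mean(axis=0)
    # PCA via SVD of the centered matrix
    U, S, Vt = np.linalg.svd(H_centered, full_matrices=False)
    return H_centered @ Vt[:dim].T                   # project to `dim` dims

# Toy 4-word co-occurrence counts (illustrative only)
cooc = np.array([[10., 2., 0., 1.],
                 [3., 8., 1., 0.],
                 [0., 1., 9., 4.],
                 [1., 0., 5., 7.]])
emb = hellinger_pca_embeddings(cooc, dim=2)
```

The appeal over neural language models, as the snippet notes, is that this reduces embedding computation to a single matrix factorization.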
Measuring Topic Coherence through Optimal Word Buckets
Measuring topic quality is essential for scoring the learned topics and their subsequent use in Information Retrieval and Text classification. To measure the quality of Latent Dirichlet Allocation (LDA)-based topics learned from text, we propose a novel approach based on grouping of topic words into buckets (TBuckets). A single large bucket signifies a single coherent theme, in turn indicating high...
Centroid-based Text Summarization through Compositionality of Word Embeddings
Textual similarity is a crucial aspect for many extractive text summarization methods. A bag-of-words representation does not allow one to grasp the semantic relationships between concepts when comparing strongly related sentences with no words in common. To overcome this issue, in this paper we propose a centroid-based method for text summarization that exploits the compositional capabilities o...
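A basic form of the centroid approach described in the snippet above can be sketched as follows: represent each sentence as the mean of its word embeddings, compute the document centroid, and keep the sentences closest to it. This is a hedged sketch of the general technique, not the cited paper's method; the function name, toy embeddings, and sentences are assumptions.

```python
import numpy as np

def centroid_summarize(sentences, embed, k=1):
    """Score each sentence by cosine similarity between its vector
    (mean of its word embeddings) and the document centroid; return
    the top-k sentences in document order. Illustrative sketch only."""
    def sent_vec(s):
        vecs = [embed[w] for w in s.split() if w in embed]
        return np.mean(vecs, axis=0)
    def cos(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    sent_vecs = [sent_vec(s) for s in sentences]
    centroid = np.mean(sent_vecs, axis=0)
    scores = [cos(v, centroid) for v in sent_vecs]
    top = np.argsort(scores)[::-1][:k]
    return [sentences[i] for i in sorted(top)]

# Toy embeddings and sentences (illustrative only)
embed = {"cats": np.array([1.0, 0.1]), "purr": np.array([0.9, 0.2]),
         "loudly": np.array([0.8, 0.3]), "stocks": np.array([0.1, 1.0])}
summary = centroid_summarize(
    ["cats purr", "cats purr loudly", "stocks stocks"], embed, k=1)
```

Because sentence vectors are compositions of word embeddings, two related sentences with no words in common can still score as similar, which is exactly the limitation of bag-of-words the snippet points out.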
The Geometry of Culture: Analyzing Meaning through Word Embeddings
We demonstrate the utility of a new methodological tool, neural-network word embedding models, for large-scale text analysis, revealing how these models produce richer insights into cultural associations and categories than is possible with prior methods. Word embeddings represent semantic relations between words as geometric relationships between vectors in a high-dimensional space, operationaliz...
What's in an Embedding? Analyzing Word Embeddings through Multilingual Evaluation
In the last two years, there has been a surge of word embedding algorithms and research on them. However, evaluation has mostly been carried out on a narrow set of tasks, mainly word similarity/relatedness and word relation similarity and on a single language, namely English. We propose an approach to evaluate embeddings on a variety of languages that also yields insights into the structure of ...
Journal
Journal title: Artificial Intelligence Review
Year: 2021
ISSN: 0269-2821, 1573-7462
DOI: https://doi.org/10.1007/s10462-021-10056-6